mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters

Authors

  • Hákon Jónsson
  • Aurelien Ginolhac
  • Mikkel Schubert
  • Philip L. F. Johnson
  • Ludovic Orlando
Abstract

MOTIVATION: Ancient DNA (aDNA) molecules in fossilized bones and teeth, coprolites, sediments, mummified specimens and museum collections are rich sources of information for evolutionary biologists, revealing the agents of past epidemics and the dynamics of past populations. However, the analysis of aDNA generally faces two major issues. First, sequences consist of a mixture of endogenous DNA and various exogenous backgrounds, mostly microbial. Second, high nucleotide misincorporation rates can be observed as a result of severe post-mortem DNA damage. Such misincorporation patterns are instrumental in authenticating ancient sequences against modern contaminants. We recently developed the user-friendly mapDamage package, which identifies such patterns in next-generation sequencing (NGS) datasets. The absence of formal statistical modeling of the DNA damage process, however, precluded rigorous quantitative comparisons across samples.

RESULTS: Here, we describe mapDamage 2.0, which extends the original features of mapDamage by incorporating a statistical model of DNA damage. Assuming that damage events depend only on sequencing position and post-mortem deamination, our Bayesian statistical framework provides estimates of four key features of aDNA molecules: the average length of overhangs (λ), nick frequency (ν) and cytosine deamination rates in both double-stranded regions (δ_d) and overhangs (δ_s). Our model enables rescaling base quality scores according to their probability of being damaged. mapDamage 2.0 handles NGS datasets with ease and is compatible with a wide range of DNA library protocols.

AVAILABILITY: mapDamage 2.0 is available at ginolhac.github.io/mapDamage/ as a Python package, and documentation is maintained at the Centre for GeoGenetics Web site (geogenetics.ku.dk/publications/mapdamage2.0/).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
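The idea of position-dependent damage and quality rescaling can be illustrated with a minimal sketch. This is not the mapDamage 2.0 implementation: it assumes a simplified model in which overhang length is geometric with parameter lam, nicks (ν) are ignored, and the damage probability is simply combined with the sequencing error implied by the Phred score. The function names, the parameter names delta_d and delta_s (deamination rates in double-stranded regions and overhangs) and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def overhang_probability(pos, lam):
    """Probability that a 1-based position from the read end lies in an overhang.

    Assumption: overhang length L is geometric on {0, 1, 2, ...} with
    P(L = k) = lam * (1 - lam) ** k, so P(L >= pos) = (1 - lam) ** pos.
    """
    return (1.0 - lam) ** pos


def damage_probability(pos, lam, delta_d, delta_s):
    """Deamination probability for a cytosine at `pos` (nicks ignored).

    Mixes the overhang rate (delta_s) and the double-stranded rate (delta_d)
    by the probability that the position is single-stranded.
    """
    p_over = overhang_probability(pos, lam)
    return p_over * delta_s + (1.0 - p_over) * delta_d


def rescale_quality(phred, p_damage):
    """Lower a Phred score to reflect the chance that the base is damaged.

    Assumption: sequencing error and damage are independent, so the base is
    trusted only if neither occurred.
    """
    p_error = 10.0 ** (-phred / 10.0)
    p_wrong = 1.0 - (1.0 - p_error) * (1.0 - p_damage)
    return -10.0 * math.log10(p_wrong)


if __name__ == "__main__":
    # Illustrative parameter values, not estimates from any dataset.
    lam, delta_d, delta_s = 0.3, 0.01, 0.5
    for pos in (1, 2, 5, 10):
        p = damage_probability(pos, lam, delta_d, delta_s)
        q = rescale_quality(30, p)
        print(f"pos={pos:2d}  p_damage={p:.4f}  rescaled_Q={q:.2f}")
```

Under these assumptions, bases near the read end receive the largest quality penalty, and the penalty decays toward the double-stranded rate further into the read.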


Similar resources

Comparison of Estimates Using Record Statistics from Lomax Model: Bayesian and Non Bayesian Approaches

This paper addresses the problem of Bayesian estimation of the parameters, reliability and hazard function in the context of record statistics values from the two-parameter Lomax distribution. The ML and the Bayes estimates based on records are derived for the two unknown parameters and the survival time parameters, reliability and hazard functions. The Bayes estimates are obtained based on conju...

Bayesian estimation of sequence damage in ancient DNA.

DNA extracted from archaeological and paleontological remains is usually damaged by biochemical processes postmortem. Some of these processes lead to changes in the structure of the DNA molecule, which can result in the incorporation of incorrect nucleotides during polymerase chain reaction. These base misincorporations, or miscoding lesions, can lead to the inclusion of spurious additional mut...

Accommodating the effect of ancient DNA damage on inferences of demographic histories.

DNA sequences extracted from ancient remains are increasingly used to generate large population data sets, often spanning tens of thousands of years of population history. Bayesian coalescent methods such as those implemented in the software package BEAST can be used to estimate the demographic history of these populations, sometimes resulting in complex scenarios of fluctuations in population ...

Bayesian Estimation of the Timing and Severity of a Population Bottleneck from Ancient DNA

In this first application of the approximate Bayesian computation approach using the serial coalescent, we demonstrated the estimation of historical demographic parameters from ancient DNA. We estimated the timing and severity of a population bottleneck in an endemic subterranean rodent, Ctenomys sociabilis, over the last 10,000 y from two cave sites in northern Patagonia, Argentina. Understand...

Back to BaySICS: A User-Friendly Program for Bayesian Statistical Inference from Coalescent Simulations

Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical fundament and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC anal...

Journal title:

Volume 29  Issue 

Pages  -

Publication date: 2013